feat: Enable configurable model streaming support #1202
Merged
Signed-off-by: Jet Chiang <pokyuen.jetchiang-ext@solo.io> Co-authored-by: Azeez Syed <syedazeez337@gmail.com>
Pull request overview
This PR enables configurable LLM response streaming support for agents. It extends the OpenAI provider to properly handle streaming with partial field aggregation and function calling, propagates partial event metadata to filter streaming chunks from persisted history, adds a UI toggle for users to enable/disable streaming, and includes e2e tests.
Changes:
- Added `stream` boolean field to agent configuration at API, type, and UI levels with default value of false
- Extended OpenAI model provider to aggregate streaming chunks and handle tool calls during streaming
- Added metadata filtering to prevent partial streaming events from being persisted in task history
- Added UI checkbox to enable/disable LLM response streaming with descriptive text
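The chunk aggregation described above can be sketched roughly as follows. This is not the PR's code (the actual change lives in the Python `_openai.py` provider); it is a minimal Go illustration of the general technique, assuming a streaming format in which text arrives as fragments and each tool call arrives as indexed fragments whose arguments must be concatenated. The `Chunk`, `ToolCallDelta`, `ToolCall`, `Aggregated`, and `Aggregate` names are hypothetical.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// ToolCallDelta is a hypothetical fragment of one tool call.
type ToolCallDelta struct {
	Index     int    // position of the tool call within the response
	Name      string // typically set only on the first fragment
	Arguments string // a piece of the JSON-encoded arguments
}

// Chunk is a hypothetical streaming delta from the model.
type Chunk struct {
	Text      string
	ToolCalls []ToolCallDelta
}

// ToolCall is a fully assembled tool call.
type ToolCall struct {
	Name      string
	Arguments string
}

// Aggregated is the complete response built from all chunks.
type Aggregated struct {
	Text      string
	ToolCalls []ToolCall
}

// Aggregate folds a sequence of partial chunks into one response.
func Aggregate(chunks []Chunk) Aggregated {
	var text strings.Builder
	calls := map[int]*ToolCall{}
	for _, c := range chunks {
		text.WriteString(c.Text)
		for _, d := range c.ToolCalls {
			tc, ok := calls[d.Index]
			if !ok {
				tc = &ToolCall{}
				calls[d.Index] = tc
			}
			if d.Name != "" {
				tc.Name = d.Name
			}
			tc.Arguments += d.Arguments
		}
	}
	// Emit tool calls in index order so the result is deterministic.
	indexes := make([]int, 0, len(calls))
	for i := range calls {
		indexes = append(indexes, i)
	}
	sort.Ints(indexes)
	out := Aggregated{Text: text.String()}
	for _, i := range indexes {
		out.ToolCalls = append(out.ToolCalls, *calls[i])
	}
	return out
}

func main() {
	got := Aggregate([]Chunk{
		{Text: "Checking "},
		{Text: "pods.", ToolCalls: []ToolCallDelta{{Index: 0, Name: "get_pods", Arguments: `{"ns":`}}},
		{ToolCalls: []ToolCallDelta{{Index: 0, Arguments: `"default"}`}}},
	})
	fmt.Println(got.Text)
	fmt.Println(got.ToolCalls[0].Name, got.ToolCalls[0].Arguments)
}
```

Keying fragments by index rather than by name matters because, under this assumed format, only the first fragment of a tool call carries its name.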
Reviewed changes
Copilot reviewed 37 out of 37 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| ui/src/app/agents/new/page.tsx | Added stream toggle UI control and state management |
| python/packages/kagent-core/src/kagent/core/a2a/_task_store.py | Added filtering logic to remove partial streaming events before persisting |
| python/packages/kagent-adk/src/kagent/adk/types.py | Added stream field to AgentConfig with documentation |
| python/packages/kagent-adk/src/kagent/adk/models/_openai.py | Implemented streaming aggregation for text and tool calls |
| python/packages/kagent-adk/src/kagent/adk/converters/request_converter.py | Added stream parameter to configure ADK StreamingMode |
| python/packages/kagent-adk/src/kagent/adk/converters/event_converter.py | Added adk_partial metadata to messages for filtering |
| python/packages/kagent-adk/src/kagent/adk/cli.py | Added stream config loading and UVICORN_LOOP parameter |
| python/packages/kagent-adk/src/kagent/adk/_agent_executor.py | Added partial event filtering in aggregator |
| python/packages/kagent-adk/src/kagent/adk/_a2a.py | Added stream parameter to KAgentApp |
| helm/kagent-crds/templates/kagent.dev_agents.yaml | Updated stream field documentation to default false |
| go/test/e2e/invoke_api_test.go | Added streaming e2e test and simplified Stream field usage |
| go/internal/controller/translator/agent/adk_api_translator.go | Added Stream field translation from CRD to config |
| go/internal/adk/types.go | Added Stream field to AgentConfig |
| go/config/crd/bases/kagent.dev_agents.yaml | Updated CRD with stream field documentation |
| go/api/v1alpha2/zz_generated.deepcopy.go | Removed pointer handling for stream field |
| go/api/v1alpha2/agent_types.go | Changed Stream from *bool to bool with false default |
| go/internal/controller/translator/agent/testdata/* | Updated test outputs with stream field set to false/true |
Comments suppressed due to low confidence (1)
go/internal/adk/types.go:281
- The UnmarshalJSON method doesn't unmarshal the ExecuteCode and Stream fields. This means when an AgentConfig is unmarshaled from JSON, these fields will always be their zero values (false). Add ExecuteCode and Stream to the tmp struct and copy them to the AgentConfig after unmarshaling.
func (a *AgentConfig) UnmarshalJSON(data []byte) error {
	var tmp struct {
		Model        json.RawMessage       `json:"model"`
		Description  string                `json:"description"`
		Instruction  string                `json:"instruction"`
		HttpTools    []HttpMcpServerConfig `json:"http_tools"`
		SseTools     []SseMcpServerConfig  `json:"sse_tools"`
		RemoteAgents []RemoteAgentConfig   `json:"remote_agents"`
	}
	if err := json.Unmarshal(data, &tmp); err != nil {
		return err
	}
	model, err := ParseModel(tmp.Model)
	if err != nil {
		return err
	}
	a.Model = model
	a.Description = tmp.Description
	a.Instruction = tmp.Instruction
	a.HttpTools = tmp.HttpTools
	a.SseTools = tmp.SseTools
	a.RemoteAgents = tmp.RemoteAgents
	return nil
}
EItanya approved these changes on Jan 13, 2026
Original PR: #1161, close #1099
Screen.Recording.2026-01-09.at.7.32.42.PM.mov
- Extends the OpenAI model provider to handle the `LmResponse` partial field, result aggregation, and function calling. I tested that this also works with other model providers, such as Gemini.
- Propagates `Event.partial` at the A2A `Message` level so that the task store filters partial chunks out and avoids persisting them (only `TaskStatusUpdateEvent`s that contain complete chunks are saved).
- An alternative is to set this at the `Task` level in `_get_context_metadata` and reconstruct the event on the client side, but currently we do it at the Message level since it makes more sense. There is a discussion from ADK Python on this regarding their A2A-ADK integration.
- Adds an e2e test with `stream: true`, supported by the new MockLLM.
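The filtering step described above can be illustrated with a short sketch. The real logic is in the Python `_task_store.py`; this Go version only shows the idea, assuming each message carries a metadata map in which the `adk_partial` key (named in the file summary) holds a boolean. The `Message` type and the `IsPartial` and `FilterPartial` helpers are hypothetical.

```go
package main

import "fmt"

// partialKey is the metadata key named in the file summary; storing a
// boolean under it is an assumption made for this sketch.
const partialKey = "adk_partial"

// Message is a hypothetical stand-in for an A2A message.
type Message struct {
	Text     string
	Metadata map[string]any
}

// IsPartial reports whether a message is marked as a streaming chunk.
// A missing key or a non-boolean value counts as "not partial".
func IsPartial(m Message) bool {
	v, ok := m.Metadata[partialKey].(bool)
	return ok && v
}

// FilterPartial returns the history without partial streaming chunks,
// leaving the input slice untouched.
func FilterPartial(history []Message) []Message {
	kept := make([]Message, 0, len(history))
	for _, m := range history {
		if !IsPartial(m) {
			kept = append(kept, m)
		}
	}
	return kept
}

func main() {
	history := []Message{
		{Text: "Hel", Metadata: map[string]any{partialKey: true}},
		{Text: "Hello", Metadata: map[string]any{partialKey: false}},
		{Text: "no metadata"},
	}
	for _, m := range FilterPartial(history) {
		fmt.Println(m.Text)
	}
}
```

Treating a missing key as complete keeps messages produced with streaming disabled in the persisted history.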